07. ADF and roots
ADF and Roots of the Characteristic Equation (very optional but kinda cool!)
The explanation of ADF in the previous section is usually given with a bit more vocabulary. Normally, you might hear ADF described as a test for the existence of a unit root. This section introduces that math vocabulary, but please keep the main idea of the previous section in mind if this one starts to feel like too much.
If we have an AR(p) model
y_t = \beta_1 y_{t-1} + \dots + \beta_p y_{t-p} + \epsilon_t
we can put all the terms that are not the white noise \epsilon_t on the left, like this:
y_t - \beta_1 y_{t-1} - \dots - \beta_p y_{t-p} = \epsilon_t.
Then we set the left side of the equation equal to zero. What we have on the left is called the “characteristic equation”. You might recall from algebra that when we set an expression equal to zero, we are usually trying to solve for the roots (the values that make it equal zero). Before we can solve for the roots of this equation, we need to rewrite it differently using something called backward shift notation.
Backward shift notation looks like this: B^n y_t = y_{t-n}. So when we see y_{t-1} , we’ll replace it with B^1 y_t . If we see y_{t-2} , we’ll replace it with B^2 y_t . The nice thing about backward shift notation is that we can describe our lags in terms of y_t , which will come in handy in the part that’s coming up.
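If it helps to see this in code, the backward shift operator behaves like the `shift` method on a pandas Series. Here is a minimal sketch; the values in the example series are made up purely for illustration.

```python
import pandas as pd

# A small example series; the values are arbitrary
y = pd.Series([10.0, 11.0, 9.5, 12.0, 13.5])

# B^1 y_t = y_{t-1}: each position now holds the previous observation (NaN at the start)
print(y.shift(1))

# B^2 y_t = y_{t-2}: each position holds the observation from two steps back
print(y.shift(2))
```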
So we can change this equation:
y_t - \beta_1 y_{t-1} - \dots - \beta_p y_{t-p} = 0
Into this:
y_t - \beta_1 (B y_t) - \dots - \beta_p (B^p y_t) = 0
Notice how we can now factor out the y_t , so we have:
y_t (1 - \beta_1 B - \dots - \beta_p B^p) = 0
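Once the lag polynomial is written this way, we can hand its coefficients to a root finder. Here is a minimal sketch using numpy; the AR(2) coefficients are made-up numbers chosen only to show the mechanics.

```python
import numpy as np

# Hypothetical AR(2) coefficients beta_1, beta_2 (chosen only for illustration)
betas = [0.75, -0.125]

# Coefficients of 1 - beta_1 B - ... - beta_p B^p, ordered by increasing powers of B
poly_coeffs = [1.0] + [-b for b in betas]

# Roots of the characteristic equation
roots = np.polynomial.polynomial.polyroots(poly_coeffs)
print(roots)  # -> [2. 4.]; both roots are greater than one, suggesting a stationary series
```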
Okay, let’s look at some examples to see what this means. We saw previously that an AR(1) model with a coefficient of one:
y_t = y_{t-1} + \epsilon_t
is called a random walk, and that a random walk is not stationary. If we write the characteristic equation of the random walk, it looks like this:
y_t - y_{t-1} = 0
Next, we rewrite it with backward shift notation:
y_t - B y_t = 0
Then we factor out the y_t to get:
y_t (1 - B) = 0
And we solve for B to get B = 1. The root equals one, and you might hear people say that the series has a unit root, or that its root “equals unity”.
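We can confirm this numerically with the same root-finding idea as in the sketch above; this is just a sanity check of the hand calculation.

```python
import numpy as np

# Characteristic polynomial of the random walk: 1 - B
print(np.polynomial.polynomial.polyroots([1.0, -1.0]))  # -> [1.], a unit root
```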
Next, let’s look at an AR(1) series where the \beta coefficient is less than one (let’s say \beta is \frac{1}{2} ).
y_t = \frac{1}{2} y_{t-1} + \epsilon_t
The characteristic equation looks like this:
y_t - \frac{1}{2} y_{t-1} = 0
In backward shift notation, it looks like:
y_t - \frac{1}{2} B y_t = 0
Factor out the y_t :
y_t(1 - \frac{1}{2} B) = 0
Solving for B is solving for the root of the characteristic equation. So we get 1 = \frac{1}{2} B , and so B = 2. Since the root is greater than one, we can say that the series is stationary.
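The same quick numerical check confirms the root we found by hand.

```python
import numpy as np

# Characteristic polynomial 1 - (1/2) B
print(np.polynomial.polynomial.polyroots([1.0, -0.5]))  # -> [2.], greater than one
```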
Note that for series with more than one lag, the characteristic equation has more than one root (as in the AR(2) sketch above), and all of the roots need to be greater than one in absolute value for the series to be stationary.
The Augmented Dickey-Fuller test has a null hypothesis that a series has a unit root. In other words, the null hypothesis is that the series behaves like a random walk, which is not stationary. The alternative hypothesis is that the roots of the series are all greater than one, which suggests that the series is stationary. If the ADF test gives a p-value of 0.05 or less, we reject the null hypothesis and can treat the series as stationary.
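To see the test in action, here is a sketch using `adfuller` from statsmodels on two simulated series: a random walk and an AR(1) with \beta = \frac{1}{2}. The series and the random seed are made up for illustration, so the exact p-values will vary, but the random walk should produce a large p-value and the AR(1) a very small one.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
n = 1000
noise = rng.normal(size=n)

# Random walk: y_t = y_{t-1} + eps_t  (unit root, not stationary)
random_walk = np.cumsum(noise)

# AR(1) with beta = 0.5: y_t = 0.5 * y_{t-1} + eps_t  (root = 2, stationary)
ar_half = np.zeros(n)
for t in range(1, n):
    ar_half[t] = 0.5 * ar_half[t - 1] + noise[t]

for name, series in [("random walk", random_walk), ("AR(1), beta=0.5", ar_half)]:
    adf_stat, p_value = adfuller(series)[:2]
    print(f"{name}: ADF statistic = {adf_stat:.2f}, p-value = {p_value:.4f}")
```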
Engle-Granger Test
The Engle-Granger test is used to check whether two series are cointegrated. It involves two steps. First, calculate the hedge ratio by regressing one series on the other:
y_t = \beta x_t
We call the \beta the “hedge ratio”.
Second, we compute y_t - \beta x_t to create a series that may be stationary. We’ll call this new series z_t . Then we use the ADF test to check whether z_t is stationary. If z_t is stationary, we can assume that the x and y series are cointegrated.
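Here is a minimal sketch of these two steps, assuming statsmodels is available. The series x and y are simulated so that they are cointegrated by construction, and the names and seed are just for illustration; the regression has no intercept to match the y_t = \beta x_t form above. (statsmodels also provides a `coint` function that wraps a version of this procedure.)

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
n = 1000

# Two series sharing a random-walk component, so they are cointegrated by construction
common_trend = np.cumsum(rng.normal(size=n))
x = common_trend + rng.normal(scale=0.5, size=n)
y = 2.0 * common_trend + rng.normal(scale=0.5, size=n)

# Step 1: regress y on x (no intercept) to estimate the hedge ratio beta
hedge_ratio = sm.OLS(y, x).fit().params[0]

# Step 2: form the spread z_t = y_t - beta * x_t and test it for stationarity with ADF
z = y - hedge_ratio * x
adf_stat, p_value = adfuller(z)[:2]
print(f"hedge ratio = {hedge_ratio:.3f}, ADF p-value on the spread = {p_value:.4f}")
```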